Computing Statistical Profiles of Active Sites in Proteins
Abstract
Active sites in proteins are three-dimensional substructures that enable proteins to perform their functions. The problem of finding substructures in a protein that are “similar” to the active sites of another protein has several important applications in the biological sciences, such as drug design, genetic engineering, and diagnostic tools for the analysis of genetically engineered pathogens. Active sites can be grouped into families whose members are related by the similarity of their functions. Since similar sites exhibit variability in their physico-chemical and structural features, statistical profiling methods can capture the shared features robustly in the presence of such variations. In this paper, we adapt Profile Hidden Markov Models (PHMMs), which have been successfully used for analyzing biological sequences, to statistically profile active site families. Since PHMMs can only profile one-dimensional sequences, we develop a serialization of the three-dimensional active sites that captures certain shared physico-chemical and geometric features of the family. PHMM parameters are learnt from these serialized sequences. While traditional PHMM learning algorithms deal with discrete physico-chemical features only, we extend them to include geometric features drawn from a continuous probability distribution. Experimental results with our PHMM-based method for profiling active sites suggest that it is effective in practice.
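As an illustration of what mixing discrete and continuous emissions could look like, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical serialization that orders residues by distance from the site centroid, and a match state that emits a physico-chemical class label from a categorical distribution together with one geometric value from a Gaussian. The abstract does not specify the serialization order or the continuous distribution, so both choices, along with all function and parameter names, are assumptions made for illustration.

```python
import math


def serialize_by_centroid_distance(residues):
    """Turn a 3D site into a 1D sequence (hypothetical ordering).

    residues: list of (class_label, (x, y, z)) tuples.
    Returns a list of (class_label, distance_to_centroid), nearest first.
    """
    n = len(residues)
    centroid = tuple(sum(pos[i] for _, pos in residues) / n for i in range(3))
    labelled = [(label, math.dist(pos, centroid)) for label, pos in residues]
    # sorted() is stable, so residues at equal distance keep their input order.
    return sorted(labelled, key=lambda item: item[1])


def match_emission_logprob(label, value, class_probs, mean, std):
    """Log-probability that one match state emits (label, value).

    Assumes the discrete label and the continuous geometric value are
    independent given the state: categorical term plus Gaussian log-density.
    """
    log_discrete = math.log(class_probs[label])
    log_gauss = (-0.5 * math.log(2.0 * math.pi * std ** 2)
                 - (value - mean) ** 2 / (2.0 * std ** 2))
    return log_discrete + log_gauss


if __name__ == "__main__":
    site = [("polar", (0.0, 0.0, 0.0)),
            ("charged", (4.0, 0.0, 0.0)),
            ("hydrophobic", (2.0, 0.0, 0.0))]
    sequence = serialize_by_centroid_distance(site)
    probs = {"polar": 0.2, "charged": 0.3, "hydrophobic": 0.5}
    for label, dist in sequence:
        print(label, dist, match_emission_logprob(label, dist, probs, 1.0, 1.0))
```

In a full profile HMM, a per-state score like this would replace the purely discrete emission term inside the usual forward or Viterbi recursions; the transition structure (match, insert, delete states) is unchanged by this sketch.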
Similar resources
Automatic classification of highly related Malate Dehydrogenase and L-Lactate Dehydrogenase based on 3D-pattern of active sites
Accurate protein function prediction is an important subject in bioinformatics, especially where sequentially and structurally similar proteins have different functions. Malate dehydrogenase and L-lactate dehydrogenase are two evolutionarily related enzymes, which exist in a wide variety of organisms. These enzymes are sequentially and structurally similar and share common active site residues, spati...
The Effect of Starvation Stress on the Protein Profiles in Flexibacter chinensis
Background: Analysis of many proteins produced during the transition into the stationary phase and under stress conditions (including starvation stress) demonstrated that a number of novel proteins were induced in common to each stress and could be the reason for cross-protection in bacterial cells. It is necessary to investigate the synthesis of these proteins during different stress condition...
Active site nature of magnesium dichloride-supported titanocene catalysts in olefin polymerization
Heterogeneous Ziegler-Natta and homogeneous metallocene catalysts exhibit greatly different active site nature in olefin polymerization. In our previous study, it was reported that MgCl2-supported titanocene catalysts can generate both Ziegler-Natta-type and metallocene-type active sites according to the type of activators. The dual active site nature of the supported titanocene catalysts was furt...
Monte Carlo Simulation of a Linear Accelerator and Electron Beam Parameters Used in Radiotherapy
Introduction: In recent decades, several Monte Carlo codes have been introduced for research and medical applications. These methods provide both accurate and detailed calculation of particle transport from linear accelerators. The main drawback of Monte Carlo techniques is the extremely long computing time that is required in order to obtain a dose distribution with good statistical accuracy. ...
Effects of FeCl3 doping on the performance of MgCl2/TiCl4/DNPB catalyst in 1-hexene polymerization
The aim of this study was to examine the effect of catalyst doping on the performance of the MgCl2.EtOH/TiCl4 catalyst system. In this regard, a series of undoped as well as FeCl3-doped catalysts was prepared and employed in 1-hexene polymerization. A modified catalyst containing 10 wt.% of FeCl3 dopant demonstrated the highest activity, with a 32% activity increase compared to the unmodified one, amon...